obs: telemetry wrapper intermittent writeDataPoint drops (follow-up to #157) #212
Merged
Two consecutive smoke passes against worker_version 0.28.0 (main preview and prod) each emitted only 14 of 17 tool_call rows despite all handlers returning successfully. The dropped tools differ between runs, ruling out per-tool wrapper attachment as the cause. Most likely cause: writeDataPoint enqueue lost when SSE response closes isolate before AE flush completes. Fix: wrap emit in ctx.waitUntil. Not a promotion blocker — wrapper works on most calls, residual drop is strictly better than the pre-wrapper 27% wire-edge gap. Per-tool correctness (bytes_in matches no-space JSON of args exactly when emitted) is verified.
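The per-tool correctness check mentioned above (bytes_in equals the size of the no-space JSON of the args) can be sketched as follows. This is a minimal illustration, not the wrapper's actual code: the helper name `expectedBytesIn` is hypothetical, and it assumes bytes_in is a UTF-8 byte count rather than a character count.

```typescript
// Hypothetical helper (not from the oddkit source): computes the value a
// smoke test would expect in bytes_in for a given args object.
function expectedBytesIn(args: Record<string, unknown>): number {
  // JSON.stringify with no indent argument emits no whitespace between
  // tokens, which is the "no-space JSON" form.
  const compact = JSON.stringify(args);
  // Assumption: bytes_in counts UTF-8 bytes, so encode before measuring.
  return new TextEncoder().encode(compact).length;
}
```

For example, args of `{ a: 1 }` serialize to `{"a":1}`, which is 7 bytes. A smoke test could compare this value against the bytes_in column of each emitted row.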
Canon Quality — Frontmatter Schema ✅All 41 file(s) in Validator: |
Diagnostic observation documenting a residual gap in the per-tool withTelemetry wrapper introduced by klappy/oddkit#157 and promoted by klappy/oddkit#162.
What
Two smoke passes against worker_version = '0.28.0' (main preview + prod) each dropped 2 telemetry rows out of 16 successful handler returns. Different tools dropped on different runs. Retries emit cleanly.
Suggested fix
Wrap the wrapper's emit in ctx.waitUntil() to extend isolate lifecycle through the AE flush. Small change, no architecture impact.
Why filed here and not as an issue
The PAT scope for this work covers contents + PRs but not issues on klappy/oddkit. Canon observation is the natural durable format for this knowledge anyway: the diagnostic belongs in the searchable knowledge base, not buried in a GitHub issue tracker.
Not blocking the shipped promotion
klappy/oddkit#162 already shipped. The wrapper works on most calls; the residual drop is strictly better than the pre-wrapper 27% wire-edge gap. This is a follow-up fix, not a regression.
Note
Low Risk
Docs-only change adding an internal observation; no production code or runtime behavior is modified.
Overview
Adds a new Canon observation documenting intermittent loss of writeDataPoint telemetry rows from the withTelemetry wrapper despite successful tool execution.
Captures smoke-test evidence across preview/prod and proposes a follow-up implementation fix (wrapping emission in ctx.waitUntil) to avoid Cloudflare Workers isolate lifecycle flush races.
Reviewed by Cursor Bugbot for commit e75e559.